Finite mixture spectrogram modeling for multipitch tracking using a factorial hidden Markov model

Authors

  • Michael Wohlmayr
  • Franz Pernkopf
Abstract

In this paper, we present a simple and efficient feature modeling approach for tracking the pitch of two speakers speaking simultaneously. We model the spectrogram features using Gaussian mixture models (GMMs) in combination with the minimum description length (MDL) model selection criterion. This makes it possible to determine the number of Gaussian components automatically, depending on the data available for a specific pitch pair. A factorial hidden Markov model (FHMM) is applied for tracking. We compare our approach to two methods based on correlogram features [1]. Those methods use either an HMM [1] or an FHMM [7] for tracking. Experimental results on the MochaTIMIT database [2] show that our proposed approach significantly outperforms the correlogram-based methods for speech utterances mixed at 0 dB. The superior performance holds even when white Gaussian noise is added to the mixed speech utterances during pitch tracking.
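As a rough illustration of the tracking step, the sketch below decodes a two-chain FHMM by running Viterbi over the joint state space. It is not the authors' implementation, and the abstract does not say which inference algorithm they used. The function name `fhmm_viterbi`, the two-state chains, and all probabilities are assumptions made for this example. In a setting like the one described, each chain would presumably hold the pitch states of one speaker, and the observation term would come from a model of the spectrogram features for each pitch pair.

```python
import math
from itertools import product


def fhmm_viterbi(log_init1, log_trans1, log_init2, log_trans2, log_obs):
    """Most likely joint state path for two independent Markov chains
    that are coupled only through a shared observation.

    log_obs[t][i][j] is log p(y_t | chain 1 in state i, chain 2 in state j).
    Returns a list of (state1, state2) tuples, one per time step.
    """
    n1, n2 = len(log_init1), len(log_init2)
    states = list(product(range(n1), range(n2)))  # joint state space

    def step(prev, cur):
        # The chains evolve independently, so log transition scores add.
        return log_trans1[prev[0]][cur[0]] + log_trans2[prev[1]][cur[1]]

    # delta[s]: best log score of any path that ends in joint state s.
    delta = {s: log_init1[s[0]] + log_init2[s[1]] + log_obs[0][s[0]][s[1]]
             for s in states}
    backpointers = []
    for t in range(1, len(log_obs)):
        new_delta, pointers = {}, {}
        for cur in states:
            best_prev = max(states, key=lambda prev: delta[prev] + step(prev, cur))
            new_delta[cur] = (delta[best_prev] + step(best_prev, cur)
                              + log_obs[t][cur[0]][cur[1]])
            pointers[cur] = best_prev
        delta = new_delta
        backpointers.append(pointers)

    # Backtrack from the best final joint state.
    path = [max(states, key=lambda s: delta[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy inputs (all values are made up for illustration).
log_init = [math.log(0.5), math.log(0.5)]
sticky = [[math.log(0.8), math.log(0.2)],
          [math.log(0.2), math.log(0.8)]]
favoured = [(0, 1), (1, 1), (1, 0)]  # joint state the observation favours per frame
log_obs = [[[math.log(0.97 if (i, j) == pair else 0.01) for j in range(2)]
            for i in range(2)]
           for pair in favoured]

path = fhmm_viterbi(log_init, sticky, log_init, sticky, log_obs)
print(path)  # [(0, 1), (1, 1), (1, 0)]
```

Because every state of one chain is paired with every state of the other, the cost of this exact decoding grows with the square of the number of joint states per frame, so it is only practical for small state spaces.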


Similar resources

Multipitch tracking using a factorial hidden Markov model

In this paper, we present an approach to track the pitch of two simultaneous speakers. Using a well-known feature extraction method based on the correlogram, we track the resulting data using a factorial hidden Markov model (FHMM). In contrast to the recently developed multipitch determination algorithm [1], which is based on an HMM, we can accurately associate estimated pitch points with their ...


Speaker-dependent multipitch tracking using deep neural networks

Multipitch tracking is important for speech and signal processing. However, it is challenging to design an algorithm that achieves accurate pitch estimation and correct speaker assignment at the same time. In this paper, deep neural networks (DNNs) are used to model the probabilistic pitch states of two simultaneous speakers. To capture speaker-dependent information, two types of DNN with diffe...


Stochastic-deterministic signal modelling for the tracking of pitch in noise and speech mixtures using factorial HMMs

Obtaining estimates of the fundamental frequencies associated with either noise or speech in noise/speech mixtures can be important in speech enhancement. Accurate simultaneous estimation of these can result in both an improved subjective quality as well as a higher signal to noise ratio (SNR) of the resulting speech. It is crucial with such an algorithm that each periodic component be reliably...


A Hierarchical Bayesian Model of Chords, Pitches, and Spectrograms for Multipitch Analysis

This paper presents a statistical multipitch analyzer that can simultaneously estimate pitches and chords (typical pitch combinations) from music audio signals in an unsupervised manner. A popular approach to multipitch analysis is to perform nonnegative matrix factorization (NMF) for estimating the temporal activations of semitone-level pitches and then execute thresholding for making a pianor...


Multipitch Tracking for Noisy and Reverberant Speech

Multipitch tracking in real environments is critical for speech signal processing. Determining pitch in reverberant and noisy speech is a particularly challenging task. In this paper, we propose a robust algorithm for multipitch tracking in the presence of both background noise and room reverberation. An auditory front-end and a new channel selection method are utilized to extract pe...




Publication date: 2009